Cytochrome P450 site of metabolism prediction from 2D topological fingerprints using GPU accelerated probabilistic classifiers

Authors

  • Jonathan D. Tyzack
  • Hamse Y. Mussa
  • Mark J. Williamson
  • Johannes Kirchmair
  • Robert C. Glen
Abstract

BACKGROUND The prediction of sites and products of metabolism in xenobiotic compounds is key to the development of new chemical entities, where screening potential metabolites for toxicity or unwanted side-effects is of crucial importance. In this work, 2D topological fingerprints are used to encode atomic sites, and three probabilistic machine learning methods are applied: Parzen-Rosenblatt Window (PRW), Naive Bayesian (NB) and a novel approach called RASCAL (Random Attribute Subsampling Classification ALgorithm). These are implemented by randomly subsampling descriptor space to alleviate the problem, often suffered by data mining methods, of having to match fingerprints exactly, and in the case of PRW by measuring a distance between feature vectors rather than requiring an exact match. The classifiers have been implemented in CUDA/C++ to exploit the parallel architecture of graphics processing units (GPUs) and are freely available in a public repository.

RESULTS It is shown that for PRW a SoM (Site of Metabolism) is identified in the top two predictions for 85%, 91% and 88% of the CYP 3A4, 2D6 and 2C9 data sets respectively, with RASCAL giving similar performance of 83%, 91% and 88%, respectively. These results put PRW and RASCAL performance ahead of NB, which gave a much lower classification performance of 51%, 73% and 74%, respectively.

CONCLUSIONS 2D topological fingerprints calculated to a bond depth of 4-6 contain sufficient information to allow the identification of SoMs using classifiers based on relatively small data sets. Thus, the machine learning methods outlined in this paper are conceptually simpler and more efficient than other methods tested, and the use of simple topological descriptors derived from 2D structure gives results competitive with other approaches using more expensive quantum chemical descriptors. The descriptor space subsampling approach and ensemble methodology allow the methods to be applied to molecules more distant from the training data, where data mining would be more likely to fail due to the lack of common fingerprints. The RASCAL algorithm is shown to give equivalent classification performance to PRW but at lower computational expense, allowing it to be applied more efficiently in the ensemble scheme.
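The abstract describes scoring atoms by a distance between fingerprint vectors, averaged over random subsamples of descriptor space, and reports how often a SoM appears in the top two predictions. The following is only a minimal sketch of that idea in plain Python, not the authors' CUDA/C++ implementation. The Gaussian kernel on Hamming distance, the bandwidth `h`, the subset size, the number of ensemble members and all function names are assumptions chosen for illustration.

```python
import math
import random


def hamming(a, b, idx):
    """Hamming distance between two bit vectors, restricted to positions idx."""
    return sum(a[i] != b[i] for i in idx)


def prw_score(query, train_fps, train_labels, idx, h=1.0):
    """Parzen-window style estimate of P(SoM | query) on a subset of bits.

    Assumption: a Gaussian kernel on Hamming distance with bandwidth h.
    """
    w_pos = 0.0
    w_neg = 0.0
    for fp, is_som in zip(train_fps, train_labels):
        d = hamming(query, fp, idx)
        k = math.exp(-(d * d) / (2.0 * h * h))
        if is_som:
            w_pos += k
        else:
            w_neg += k
    total = w_pos + w_neg
    return w_pos / total if total > 0 else 0.0


def ensemble_score(query, train_fps, train_labels,
                   n_models=25, subset_size=4, h=1.0, seed=0):
    """Average the window score over random subsamples of descriptor space."""
    rng = random.Random(seed)
    n_bits = len(query)
    scores = []
    for _ in range(n_models):
        idx = rng.sample(range(n_bits), subset_size)
        scores.append(prw_score(query, train_fps, train_labels, idx, h))
    return sum(scores) / len(scores)


def top_k_hit(atom_scores, som_flags, k=2):
    """True if any of the k highest-scoring atoms is an annotated SoM."""
    ranked = sorted(range(len(atom_scores)),
                    key=lambda i: atom_scores[i], reverse=True)
    return any(som_flags[i] for i in ranked[:k])


if __name__ == "__main__":
    # Toy data (hypothetical): one SoM-like and one non-SoM-like fingerprint.
    train_fps = [[1, 1, 1, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 1, 1, 1]]
    train_labels = [True, False]
    atoms = [[0, 0, 0, 0, 1, 1, 1, 1], [1, 1, 1, 1, 0, 0, 0, 0]]
    scores = [ensemble_score(a, train_fps, train_labels) for a in atoms]
    print(scores, top_k_hit(scores, [False, True], k=1))
```

Because each ensemble member only compares a random subset of bits, a query atom does not need to match any training fingerprint exactly to receive a non-zero score, which is the motivation the abstract gives for subsampling.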


Similar resources

Expression of cytochrome P450 and glutathione S-transferase in human bone marrow mesenchymal stem cells

Currently several studies are being carried out on various properties of mesenchymal stem cells (MSCs); however, there are few investigations of the drug-metabolizing properties of these cells. The aim of this study was to measure the key factors involved in drug metabolism in human bone marrow MSCs. For this purpose, cellular glutathione (GSH), glutathione S-transferase (GSTs) and...


The SMARTCyp cytochrome P450 metabolism prediction server

SUMMARY The SMARTCyp server is the first web application for site of metabolism prediction of cytochrome P450-mediated drug metabolism. AVAILABILITY The SMARTCyp server is freely available for use on the web at www.farma.ku.dk/smartcyp where the SMARTCyp Java program and source code are also available for download. CONTACT [email protected]; [email protected] SUPPLEMENTARY INFORMATION Supp...


In Silico Prediction of Cytochrome P450-Drug Interaction: QSARs for CYP3A4 and CYP2C9

Cytochromes P450 (CYP) are the main actors in the oxidation of xenobiotics and play a crucial role in drug safety, persistence, bioactivation, and drug-drug/food-drug interaction. This work aims to develop Quantitative Structure-Activity Relationship (QSAR) models to predict the drug interaction with two of the most important CYP isoforms, namely 2C9 and 3A4. The presented models are calibrated...


2D SMARTCyp Reactivity-Based Site of Metabolism Prediction for Major Drug-Metabolizing Cytochrome P450 Enzymes

Cytochrome P450 (CYP) 3A4, 2D6, 2C9, 2C19, and 1A2 are the most important drug-metabolizing enzymes in the human liver. Knowledge of which parts of a drug molecule are subject to metabolic reactions catalyzed by these enzymes is crucial for rational drug design to mitigate ADME/toxicity issues. SMARTCyp, a recently developed 2D ligand structure-based method, is able to predict site-specific met...


P-192: Association of Cytochrome P450 2D6 (CYP2D6) Gene Polymorphism with Clomiphene Citrate Treatment in Iranian Infertile Women with Polycystic Ovary Syndrome

Background: Clomiphene Citrate (CC), which aims at restoring ovulation, is the most frequently administered drug for the treatment of female infertility [e.g. polycystic ovary syndrome (PCOS)]. Clomiphene is metabolized by CYP2D6, an important enzyme responsible for the metabolism of approximately 25% of clinically used drugs. CYP2D6 is very polymorphic and thought to result in inter-individual...



Journal title:

Volume 6, Issue

Pages -

Publication date: 2014